Analysis_yhc

2024-11-20

library(readr)
library(dplyr)
library(stringr)
library(ggplot2)
library(lubridate)
library(tidyr)
library(shiny)
library(plotly)

Import and clean data

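# Read the weekly death counts by state and cause
data <- read_csv("./data/weekly_deaths_by_state_and_causes.csv")

# Keep the national rows, shorten the column names, and add a month column
general_data <- data |>
  janitor::clean_names() |>
  rename_with(~ str_replace_all(., " ", "_")) |>
  filter(jurisdiction_of_occurrence == "United States") |>
  # Drop everything from the first "_<word char><digit>" onward (code-like suffixes);
  # make.unique() disambiguates names that collapse to the same stem (covid, covid.1)
  rename_with(~ make.unique(str_replace(., "_\\w\\d.*", ""))) |>
  mutate(month = month(week_ending_date)) |>
  rename(
    covid_multiple_cause = covid,
    covid_underlying_cause = covid.1,
    symptoms_not_classified = symptoms_signs_and_abnormal_clinical_and_laboratory_findings_not_elsewhere_classified
  )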


# Read the tab-delimited population file
p_data <- read.delim("./data/Population by States.txt",
                     header = TRUE, stringsAsFactors = FALSE) |>
  janitor::clean_names()

# Sum the population column into one total per year for 2020-2023,
# then rename year_code to mmwr_year
population_summary <- p_data |>
  filter(year_code >= 2020, year_code <= 2023) |>
  group_by(year_code) |>
  summarise(Total_Population = sum(population, na.rm = TRUE)) |>
  rename(mmwr_year = year_code)
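
The yearly totals in population_summary can serve as denominators for crude death rates. The block below is a minimal sketch of that join, not the analysis itself. It assumes the deaths table has mmwr_year and all_cause columns (all_cause is not confirmed in this section), and it uses small made-up tables in place of the real data.

```r
library(dplyr)

# Toy stand-ins for general_data and population_summary; all numbers are invented.
toy_deaths <- tibble(
  mmwr_year = c(2020, 2020, 2021, 2021),
  all_cause = c(60000, 62000, 58000, 61000)  # assumed column name
)
toy_population <- tibble(
  mmwr_year = c(2020, 2021),
  Total_Population = c(330000000, 332000000)
)

# Attach each year's population to its weekly rows,
# then express weekly deaths per 100,000 residents.
toy_rates <- toy_deaths |>
  left_join(toy_population, by = "mmwr_year") |>
  mutate(deaths_per_100k = all_cause / Total_Population * 1e5)
```

Scaling by population makes weeks from different years comparable even when the population changes between years.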

Thoughts about framework

What can we learn from death?

Have you ever wondered which week of the year you are most likely to die in? Time Trend Analysis: By examining how death rates fluctuate across the 52 weeks of a year, we can identify peak mortality periods and look for their underlying causes, such as seasonal illnesses or public health crises.
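
One way to look for a peak week is to average the counts for each week number across years and pick the largest. This is a rough sketch under assumptions: the column names mmwr_week and all_cause are assumed rather than taken from this section, and the values are invented for illustration.

```r
library(dplyr)

# Toy weekly counts for two years and three weeks; all values are invented.
toy_weekly <- tibble(
  mmwr_year = rep(c(2020, 2021), each = 3),
  mmwr_week = rep(1:3, times = 2),            # assumed column name
  all_cause = c(70000, 65000, 60000, 80000, 66000, 59000)
)

# Average deaths for each week number across the available years.
weekly_profile <- toy_weekly |>
  group_by(mmwr_week) |>
  summarise(mean_deaths = mean(all_cause), .groups = "drop")

# Week number with the highest average count.
peak_week <- weekly_profile |>
  slice_max(mean_deaths, n = 1) |>
  pull(mmwr_week)
```

Averaging across years smooths out single-year shocks, so a peak that survives the averaging is more likely to be seasonal than tied to a one-off event.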

Weekly trends might look different depending on where you are, as mortality rates are influenced by both environmental and healthcare system factors.

National and Regional Level Analysis: Explore how deaths change across time and regions.

City-Level Analysis: Narrow the analysis down to a single city, such as New York.